2 research outputs found

    Bayesian spatial modeling of malnutrition and mortality among under-five children in sub-Saharan Africa.

    Doctoral Degree. University of KwaZulu-Natal, Pietermaritzburg. The aim of this thesis is to develop and extend Bayesian statistical models in the area of spatial modeling and apply them to child health outcomes, with particular focus on childhood malnutrition and mortality among under-five children. The easy availability of geo-referenced databases has stimulated a paradigm shift in methodological approaches to spatial analysis. This study reviewed the spatial methods and disease mapping models developed for areal (lattice) data analysis. Observational data collected from complex survey designs and geographical locations often violate the independence assumption of classical regression models. By relaxing the restrictive linearity and normality assumptions of classical regression models, this study first developed a flexible semi-parametric spatial model that accommodates the usual fixed effects, nonlinear effects and a geographical component in a unified model. The approach was explored in the analysis of spatial patterns of child birth outcomes in Nigeria. The study also addressed the issue of disease clustering, which is of interest to epidemiologists and public health officials. The study then proposed a Bayesian hierarchical analysis approach for Poisson count data and formulated a Poisson version of generalized linear mixed models (GLMMs) for analyzing childhood mortality. The model simultaneously addressed the problems of overdispersion and spatial dependence by including the risk factors and random effects in a single model. The proposed approach identified regions with elevated relative risk or clustering of high mortality and evaluated the small-scale geographical disparities in sub-populations across the regions. The study identified two further challenges in spatial data analysis: spatial autocorrelation and model misspecification.
The study then fitted geoadditive mixed (GAM) models to analyze childhood anaemia data with responses from the exponential family of distributions (Gaussian, binary and multinomial). The GAM models extend generalized linear mixed models by allowing the inclusion of splines for continuous covariate (or time) trends alongside the parametric terms. Lastly, the shared component model originally developed for multiple disease mapping was reviewed and modified to suit the binary data at hand. A multivariate conditional autoregressive (MCAR) model was developed and applied to jointly analyze three child malnutrition indicators. The approach facilitated estimation of the conditional correlation between the diseases and assessment of the spatial association with the regions and the geographical variation of individual disease prevalence. The spatial analysis presented in this thesis is useful to inform health-care policy and resource allocation. This thesis contributes to methodological applications in life sciences, environmental sciences, public health and agriculture. The present study expands the existing methods and tools for health impact assessment in public health studies. KEYWORDS: Conditional Autoregressive (CAR) model, Disease Mapping Models, Multiple Disease Mapping, Health Geography, Ecology Models, Spatial Epidemiology, Childhood Health Outcomes
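
As a rough illustration of the conditional autoregressive (CAR) idea named in the keywords, the sketch below builds the precision matrix of an intrinsic CAR prior from a region adjacency matrix. It is not taken from the thesis: the four-region map, the function name `icar_precision` and the effect values are assumptions chosen only to show the structure, in which each region's spatial effect is pulled toward the average of its neighbours.

```python
import numpy as np

def icar_precision(W):
    """Intrinsic CAR precision matrix Q = D - W (hypothetical helper).

    W is a symmetric 0/1 adjacency matrix; D holds each region's
    neighbour count on its diagonal.
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))
    return D - W

# Assumed toy map: four regions in a row, 0-1-2-3.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
Q = icar_precision(W)

# Illustrative spatial random effects, one per region.
phi = np.array([0.2, 0.1, -0.1, -0.2])

# The quadratic form phi' Q phi equals the sum of squared differences
# between neighbouring regions, so smooth surfaces are penalised less.
penalty = phi @ Q @ phi
print(Q)
print(penalty)
```

In a full Bayesian disease mapping model this quadratic form would enter the log prior of the spatial effects, scaled by a precision parameter that is itself given a prior.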

    Empirical statistical modelling for crop yields predictions: bayesian and uncertainty approaches

    Includes bibliographical references. This thesis explores uncertainty statistics to model agricultural crop yields in situations where there are neither sampling observations nor historical records. The Bayesian approach to a linear regression model is useful for prediction of crop yield when there are data quantity issues and model structure uncertainty, and when the regression model involves a large number of explanatory variables. Data quantity issues might occur when a farmer is cultivating a new crop variety, moving to a new farming location or introducing a new farming technology, where the situation may warrant a change in the current farming practice. The first part of this thesis involved the collection of data from domain experts and the elicitation of probability distributions. Uncertainty statistics, the foundation of uncertainty theory and the data gathering procedures were discussed in detail. We proposed a procedure for the estimation of uncertainty distributions. The procedure was then implemented on agricultural data to fit uncertainty distributions to five cereal crop yields. A Delphi method was introduced and used to fit uncertainty distributions to multiple experts' data on sesame seed yield. The thesis defined an uncertainty distance and derived a distance for the difference between two uncertainty distributions. We lastly estimated the distance between a hypothesized distribution and an uncertainty normal distribution. Although the applicability of uncertainty statistics is limited to one-sample models, it provides a fast way to establish a standard for process parameters. Where no sampling observations exist or they are very expensive to acquire, the approach provides an opportunity to engage experts and come up with a model for guiding decision making.
In the second part, we fitted a linear regression model to a full dataset obtained from an agricultural survey of small-scale farmers using direct Markov Chain Monte Carlo (MCMC), Bayesian estimation (with uniform prior) and maximum likelihood estimation (MLE) methods. The three procedures yielded similar mean estimates, but the Bayesian credible intervals were narrower than the MLE confidence intervals. The predictive outcome of the estimated model was then assessed using simulated data for a set of covariates. Furthermore, the dataset was randomly split into two data sets. The informative prior was estimated from one half, called the "old data", using the Ordinary Least Squares (OLS) method. Three models were then fitted to the second half, called the "new data": General Linear Model (GLM) (M1), Bayesian model with a non-informative prior (M2) and Bayesian model with informative prior (M3). A leave-one-out cross-validation (LOOCV) method was used to compare the predictive performance of these models. It was found that the Bayesian models showed better predictive performance than M1. M3 (with a prior) had moderate average cross-validation (CV) error and CV standard error. The GLM performed worst, with the least average CV error and the highest CV standard error among the models. In model M3 (expert prior), the predictor variables were found to be significant at 95% credible intervals. In contrast, most variables were not significant under models M1 and M2. Also, the model with informative prior had narrower credible intervals compared to the non-informative prior and GLM models. The results indicated that variability and uncertainty in the data were reasonably reduced due to the incorporation of the expert (informative) prior. We lastly investigated the residual plots of these models to assess their prediction performance.
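
The leave-one-out comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the data are simulated, the two candidate models are plain least-squares fits rather than M1 to M3, and the helper names `loocv_errors` and `cv_summary` are hypothetical. It shows how an average CV error and its standard error are obtained for each model.

```python
import numpy as np

def loocv_errors(X, y):
    """Leave-one-out prediction errors for a least-squares fit.

    Each observation is held out in turn, the model is refitted on the
    remaining n - 1 rows, and the held-out response is predicted.
    """
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errors[i] = y[i] - X[i] @ beta
    return errors

def cv_summary(errors):
    """Average squared CV error and its standard error."""
    sq = errors ** 2
    return sq.mean(), sq.std(ddof=1) / np.sqrt(len(sq))

# Simulated data (assumption): one informative covariate.
rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

X_full = np.column_stack([np.ones(n), x])  # intercept + covariate
X_null = np.ones((n, 1))                   # intercept only

cv_full = cv_summary(loocv_errors(X_full, y))
cv_null = cv_summary(loocv_errors(X_null, y))
print("full model  (mean CV error, SE):", cv_full)
print("null model  (mean CV error, SE):", cv_null)
```

A lower average CV error indicates better out-of-sample prediction. For a Bayesian model with an informative prior, the least-squares refit inside the loop would be replaced by the posterior mean under that prior.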
Bayesian Model Averaging (BMA) was later introduced to address the model structure uncertainty of a single model. BMA allows the computation of a weighted average over possible model combinations of predictors. An approximate AIC weight was then proposed for model selection instead of frequentist alternative hypothesis testing (or model comparison in a set of competing candidate models). The method is flexible and easier to interpret than raw AIC or the Bayesian information criterion (BIC), which approximates the Bayes factor. Zellner's g-prior was considered appropriate as it has been widely used in linear models. It preserves the correlation structure among predictors in its prior covariance. The method also yields closed-form marginal likelihoods, which lead to huge computational savings by avoiding sampling in the parameter space as in BMA. We lastly determined a single optimal model from all possible combinations of models and also computed the log-likelihood of each model.
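
To make the AIC-weight idea concrete, the sketch below enumerates every subset of a small set of predictors, computes a Gaussian AIC for each least-squares fit, and converts the AIC values into normalised weights using the standard Akaike-weight formula w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), where Δ_i is the AIC difference from the best model. The simulated data and the function names are assumptions for illustration; the thesis's own approximate weights and its g-prior computations are not reproduced here.

```python
from itertools import combinations

import numpy as np

def gaussian_aic(X, y):
    """AIC of a least-squares fit, up to an additive constant."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    k = X.shape[1] + 1  # regression coefficients plus error variance
    return n * np.log(rss / n) + 2 * k

def akaike_weights(aics):
    """Normalised weights from AIC differences to the best model."""
    delta = np.asarray(aics, dtype=float) - np.min(aics)
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Simulated data (assumption): only predictor 0 has a real effect.
rng = np.random.default_rng(1)
n, p = 60, 3
Z = rng.normal(size=(n, p))
y = 0.5 + 1.5 * Z[:, 0] + rng.normal(size=n)

# All 2^p predictor subsets, each with an intercept.
subsets = [s for r in range(p + 1) for s in combinations(range(p), r)]
aics = []
for s in subsets:
    X = np.column_stack([np.ones(n)] + [Z[:, j] for j in s])
    aics.append(gaussian_aic(X, y))

weights = akaike_weights(aics)
best = subsets[int(np.argmax(weights))]

# Weight-based inclusion score: total weight of models containing j.
inclusion = [sum(w for s, w in zip(subsets, weights) if j in s)
             for j in range(p)]
print("best subset:", best)
print("inclusion scores:", np.round(inclusion, 3))
```

The weights sum to one across the candidate set, so they can be read as relative support for each model and used either to pick a single model or to average predictions across models.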